ABSTRACT

Let $X(1), X(2), \ldots, X(n)$ be independent random variables such that

\[ P\{X(i) = j\} = P_j, \quad j = 1, 2, \ldots, n, \qquad \sum_{j=1}^{n} P_j = 1, \]

and consider a graph with n nodes numbered 1, 2, ..., n and the
arcs (i, X(i)), i = 1, 2, ..., n. We determine the probability that
this so-called random graph is connected and then develop a recursive
formula for the distribution of C, the number of connected
components it contains. We also derive expressions for the mean
and variance of C.


1.  MAIN RESULTS


Before obtaining the desired probability that the graph is connected,
we shall consider a related problem having r + 1 nodes, 0, 1, ..., r,
and r arcs (i, Y(i)), i = 1, 2, ..., r, where the Y(i) are independent
and such that

\[ P\{Y(i) = j\} = Q_j, \quad j = 0, 1, \ldots, r, \qquad \sum_{j=0}^{r} Q_j = 1. \]

We then have the following proposition.


Proposition 1:

In the related problem

\[ P\{\text{graph is connected}\} = Q_0. \]


Proof:

The proof is by induction on r. As the result is obvious for r = 1,
assume it for all values less than r. Now, in the case under
consideration, condition on the set S of indices i for which Y(i)
equals 0 to obtain

\[ P\{\text{graph is connected}\} = \sum_{S \subset \{1, 2, \ldots, r\},\ S \neq \emptyset} Q_0^{|S|} (1 - Q_0)^{r - |S|} \, P\{\text{connected} \mid Y(i) = 0,\ i \in S;\ Y(i) \neq 0,\ i \in S^c\} \]

where $|S|$ denotes the cardinality of S and $S^c$ the complement of S.
Now given that Y(i) = 0 for $i \in S$ and Y(i) $\neq$ 0 for $i \in S^c$, the
situation (as far as the graph being connected) is the same as if we had
$|S^c| + 1$ nodes and $|S^c|$ arcs, with each arc going into node 0 with
probability $\sum_{i \in S} Q_i / (1 - Q_0)$. Hence by the induction
hypothesis we have

\[ P\{\text{connected} \mid Y(i) = 0,\ i \in S;\ Y(i) \neq 0,\ i \in S^c\} = \sum_{i \in S} \frac{Q_i}{1 - Q_0} \]

and so

\[ P\{\text{graph is connected}\} = \frac{1}{1 - Q_0} E\Big[ \sum_{i : Y(i) = 0} Q_i \Big] = \frac{1}{1 - Q_0} E\Big[ \sum_{i=1}^{r} Q_i I_i \Big] = \frac{Q_0}{1 - Q_0} \sum_{i=1}^{r} Q_i = Q_0 \]

where

\[ I_i = \begin{cases} 1 & \text{if } Y(i) = 0 \\ 0 & \text{if } Y(i) \neq 0 \end{cases} \]

so that $E[I_i] = Q_0$, and where the last equality uses
$\sum_{i=1}^{r} Q_i = 1 - Q_0$. ||
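Proposition 1 can be checked exactly for small r by enumerating every
assignment of Y(1), ..., Y(r). The following is only a sketch under
stated assumptions: the function names and the sample vector Q are
illustrative and do not come from the paper, and arcs are read as
undirected edges when testing connectivity.

```python
from fractions import Fraction
from itertools import product

def is_connected(num_nodes, arcs):
    """True if the arcs, read as undirected edges, join all nodes."""
    label = list(range(num_nodes))          # component label of each node
    for a, b in arcs:
        la, lb = label[a], label[b]
        if la != lb:                        # merge the two components
            label = [lb if l == la else l for l in label]
    return len(set(label)) == 1

def prob_connected_related(Q):
    """Exact P{connected} for nodes 0..r and arcs (i, Y(i)), i = 1..r."""
    r = len(Q) - 1
    total = Fraction(0)
    for ys in product(range(r + 1), repeat=r):   # ys[i-1] plays Y(i)
        arcs = [(i, y) for i, y in enumerate(ys, start=1)]
        if is_connected(r + 1, arcs):
            weight = Fraction(1)
            for y in ys:
                weight *= Q[y]                   # P{Y(i) = y} = Q_y
            total += weight
    return total

# Hypothetical example vector (Q_0, Q_1, Q_2, Q_3).
Q = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
print(prob_connected_related(Q) == Q[0])
```

Exact rational arithmetic is used so that the comparison with Q_0 is an
equality rather than a floating-point approximation.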


Consider now the original problem with n nodes and arcs (i, X(i)),
i = 1, 2, ..., n. Starting at some node, say node 1, consider the
sequence of nodes $1, X(1), X^2(1), \ldots$, where
$X^k(1) = X(X^{k-1}(1))$; define N by

\[ N = \min\{k : X^k(1) \in \{1, X(1), \ldots, X^{k-1}(1)\}\} \]

and define W by

\[ W = P_1 + \sum_{k=1}^{N-1} P_{X^k(1)}. \]

In other words, N is the number of nodes reached in the sequence
$1, X(1), X^2(1), \ldots$ before a node appears twice, and W is the sum of
the probabilities of these nodes. We now have

Theorem  1 : 

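\[ P\{\text{graph is connected} \mid W\} = W. \]

Proof:

Condition on N and on the nodes $1, X(1), \ldots, X^{N-1}(1)$, and regard
these N nodes as the single node 0 of the related problem. Each of the
remaining n - N nodes then sends its arc into node 0 with probability W,
so the problem reduces to the related problem with r = n - N and
$Q_0 = W$, and the result follows from Proposition 1.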

Hence  we  have 

Corollary  1: 

P{ graph  is  connected}  = E(W)  . 
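As an illustration of Corollary 1, the sketch below computes E(W)
exactly by following the sequence started at a given node until a node
repeats, and compares it with a brute-force connectivity probability.
Nodes are indexed from 0 here, and the function names and the example
vector are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import product

def expected_W(P, start):
    """E(W) when the sequence start, X(start), X^2(start), ... is followed."""
    def walk(visited, wsum):
        total = Fraction(0)
        for j, pj in enumerate(P):
            if j in visited:                 # a node repeats: W is final
                total += pj * wsum
            else:                            # new node: add its probability
                total += pj * walk(visited | {j}, wsum + pj)
        return total
    return walk(frozenset([start]), P[start])

def prob_connected(P):
    """Brute force over all n^n mappings x, weighted by prod P[x[i]]."""
    n = len(P)
    total = Fraction(0)
    for x in product(range(n), repeat=n):
        label = list(range(n))
        for i, xi in enumerate(x):
            la, lb = label[i], label[xi]
            if la != lb:
                label = [lb if l == la else l for l in label]
        if len(set(label)) == 1:
            weight = Fraction(1)
            for xi in x:
                weight *= P[xi]
            total += weight
    return total

P = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
print([expected_W(P, s) == prob_connected(P) for s in range(len(P))])
```

The comparison is made for every starting node, since the expected sum
should not depend on where the sequence begins.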

In other words, if a sequence of independent trials, each resulting
in one of n possible outcomes with probabilities $P_1, \ldots, P_n$, is
performed then, given that the initial outcome is outcome 1, the expected
sum of the probabilities of all the distinct outcomes obtained before any
outcome has occurred twice is equal to the probability of the graph
being connected. It is interesting to note that, as we could have begun
with any of the n outcomes, the expected sum obtained is independent of
the initial outcome, a result which is not at all apparent.
Hence if we assume that the initial outcome is also randomly determined
we have that
Corollary 2:

\[ P\{\text{graph is connected}\} = \sum_{i} P_i^2 \Big[ 1 + \sum_{j \neq i} P_j + \sum_{j \neq i} \sum_{k \neq i, j} P_j P_k + \sum_{j \neq i} \sum_{k \neq i, j} \sum_{l \neq i, j, k} P_j P_k P_l + \cdots \Big] \]

Proof:

The i-th term of the sum is just $P_i$ multiplied by the probability
of a type i outcome before any of the outcomes has occurred
twice. ||
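The nested sums in Corollary 2 run over ordered sequences of distinct
outcomes other than i, which suggests a short recursive evaluation. The
sketch below is one possible way to compute the expression; the function
name is hypothetical and nodes are indexed from 0.

```python
from fractions import Fraction

def connected_prob_cor2(P):
    """Evaluate sum_i P_i^2 [1 + sum_j P_j + sum_j sum_k P_j P_k + ...]."""
    n = len(P)

    def bracket(used):
        # 1 + sum over unused j of P_j * (the same bracket with j now used)
        return 1 + sum(P[j] * bracket(used | {j})
                       for j in range(n) if j not in used)

    return sum(P[i] ** 2 * bracket(frozenset([i])) for i in range(n))

print(connected_prob_cor2([Fraction(1, 3)] * 3))                 # 17/27
print(connected_prob_cor2([Fraction(1, 3), Fraction(2, 3)]))     # 7/9
```

For n = 2 the only disconnected mapping is the identity, so the
probability of connection is 1 - P_1 P_2, which gives 7/9 for the second
example.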


A graph is said to consist of r connected components if its nodes
can be partitioned into r subsets so that each of the subsets is
connected and there are no arcs between nodes in different subsets. Let
C denote the number of connected components of the random graph with
arcs (i, X(i)), i = 1, ..., n, and let

\[ f_j(\mathbf{P}) = P\{C = j\}, \quad j = 1, 2, \ldots, n, \]

where we use the notation $f_j(\mathbf{P})$ to make explicit the
dependence on the probability vector $\mathbf{P} = (P_1, \ldots, P_n)$.
Now

\[ f_1(\mathbf{P}) = P\{C = 1\} = P\{\text{graph is connected}\} \]

can be obtained from Corollary 2. To determine $f_2(\mathbf{P})$, the
probability of exactly 2 components, fix attention on some particular
node, say node 1. In order that a given set of nodes containing node 1,
call it S, will constitute one connected component and the remaining
nodes $S^c$ a second connected component, we must have

(i)  $X(i) \in S$, $i \in S$

(ii)  $X(i) \in S^c$, $i \in S^c$

(iii)  The nodes in S form a connected subgraph

(iv)  The nodes in $S^c$ form a connected subgraph.


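Hence the probability of 2 connected components is given by

\[ f_2(\mathbf{P}) = \sum_{S} \Big( \sum_{i \in S} P_i \Big)^{|S|} \Big( \sum_{i \in S^c} P_i \Big)^{n - |S|} f_1(\mathbf{P}(S)) \, f_1(\mathbf{P}(S^c)) \tag{1} \]

where the sum is over all the subsets S containing node 1 (with $S^c$
nonempty) and the i-th component of $\mathbf{P}(S)$ is equal to
$P_i / \sum_{j \in S} P_j$ if $i \in S$ and is 0 if $i \notin S$,
and similarly for $\mathbf{P}(S^c)$. In general the recursive formula
for $f_j(\mathbf{P})$ is given by

\[ f_j(\mathbf{P}) = \sum_{S} \Big( \sum_{i \in S} P_i \Big)^{|S|} \Big( \sum_{i \in S^c} P_i \Big)^{n - |S|} f_1(\mathbf{P}(S)) \, f_{j-1}(\mathbf{P}(S^c)) \tag{2} \]

where again the sum is over all subsets of {1, 2, ..., n} which contain
node 1.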
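The recursion for the distribution of C can be sketched directly from
the conditions (i) to (iv): pick the component S containing the fixed
node, require all arcs to stay inside S and inside its complement, and
recurse on the complement. The code below is a minimal illustration
assuming all probabilities are strictly positive; node 1 of the text is
index 0, the vectors P(S) are stored without their zero entries, and the
function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def f1(P):
    """P{C = 1}, evaluated with the expression of Corollary 2."""
    n = len(P)
    def bracket(used):
        return 1 + sum(P[j] * bracket(used | {j})
                       for j in range(n) if j not in used)
    return sum(P[i] ** 2 * bracket(frozenset([i])) for i in range(n))

def f(j, P):
    """P{C = j} via the recursion over subsets S containing node 0."""
    n = len(P)
    if j == 1:
        return f1(P)
    if j > n:
        return Fraction(0)
    total = Fraction(0)
    # S = {0} plus `extra`; the complement must keep at least j - 1 nodes.
    for size in range(0, n - j + 1):
        for extra in combinations(range(1, n), size):
            S = (0,) + extra
            Sc = [i for i in range(n) if i not in S]
            mass_S = sum(P[i] for i in S)
            mass_Sc = sum(P[i] for i in Sc)
            P_S = [P[i] / mass_S for i in S]        # renormalised on S
            P_Sc = [P[i] / mass_Sc for i in Sc]     # renormalised on S^c
            total += (mass_S ** len(S) * mass_Sc ** len(Sc)
                      * f1(P_S) * f(j - 1, P_Sc))
    return total

P = [Fraction(1, 3)] * 3
print([f(j, P) for j in (1, 2, 3)])
```

For three equally likely nodes there are 27 mappings, of which 17 are
connected, 9 have two components and 1 (the identity) has three, which is
a convenient hand check on the recursion.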

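The expected number of connected components can most easily be
computed by first noting that every connected component contains exactly
one cycle. This is most easily seen by noting that a connected component
having r nodes also has r arcs and thus exactly one cycle (a single node
i with X(i) = i counts as a cycle). Since the nodes of S can be arranged
in $(|S| - 1)!$ distinct cyclic orders, each occurring with probability
$\prod_{i \in S} P_i$, we have

\[ E(C) = E(\text{number of cycles}) = E\Big[ \sum_{S} I(S) \Big] = \sum_{S} (|S| - 1)! \prod_{i \in S} P_i \tag{3} \]

where

\[ I(S) = \begin{cases} 1 & \text{if the nodes in } S \text{ constitute a cycle} \\ 0 & \text{otherwise} \end{cases} \]

and where the sum is over all nonempty subsets of {1, 2, ..., n}.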
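Grouping the subsets S by size k turns the sum for E(C) into
$\sum_k (k - 1)! \, e_k(\mathbf{P})$, where $e_k$ is the k-th elementary
symmetric polynomial. This regrouping is not stated in the text; it is
used in the sketch below only to avoid enumerating all subsets, and the
function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def elementary_symmetric(P):
    """Return [e_0, e_1, ..., e_n] for the entries of P."""
    e = [Fraction(1)] + [Fraction(0)] * len(P)
    for p in P:
        # Update from high k to low k so each entry of P is used once.
        for k in range(len(P), 0, -1):
            e[k] += p * e[k - 1]
    return e

def expected_components(P):
    """E(C) = sum over k of (k - 1)! * e_k(P)."""
    e = elementary_symmetric(P)
    return sum(factorial(k - 1) * e[k] for k in range(1, len(P) + 1))

print(expected_components([Fraction(1, 3)] * 3))   # 38/27
```

The value 38/27 agrees with the direct count for n = 3:
(17 * 1 + 9 * 2 + 1 * 3) / 27.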

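The variance of C can also be obtained by using the same
representation. We obtain

\[ \mathrm{Var}(C) = \mathrm{Var}\Big[ \sum_{S} I(S) \Big] = \sum_{S} \mathrm{Var}[I(S)] + \sum_{S} \sum_{S' \neq S} \mathrm{Cov}[I(S), I(S')]. \]

Now for $S \neq S'$,

\[ E[I(S) I(S')] = \begin{cases} 0 & \text{if } S \cap S' \neq \emptyset \\ E[I(S)] \, E[I(S')] & \text{if } S \cap S' = \emptyset \end{cases} \]

and thus

\[ \mathrm{Var}(C) = \sum_{S} E[I(S)] (1 - E[I(S)]) - \sum_{S} E[I(S)] \sum_{S' \neq S,\ S' \cap S \neq \emptyset} E[I(S')]. \tag{4} \]

Now,

\[ \sum_{S' \cap S \neq \emptyset} I(S') = \text{number of cycles having a node in } S \]

and thus

\[ E\Big[ \sum_{S' \cap S \neq \emptyset,\ S' \neq S} I(S') \Big] = E(C) - E[I(S)] - E\Big[ \sum_{S' \subset S^c} I(S') \Big]. \]

Hence from (4) we obtain

\[ \mathrm{Var}(C) = \sum_{S} E[I(S)] - \Big[ \sum_{S} E[I(S)] \Big]^2 + \sum_{S} \sum_{S' \subset S^c} E[I(S)] \, E[I(S')] \tag{5} \]

where

\[ E[I(S)] = (|S| - 1)! \prod_{j \in S} P_j. \]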
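The variance expression can be compared with a brute-force computation
for small n. The following is a sketch only: it evaluates the three sums
over nonempty subsets, with S' ranging over nonempty subsets of the
complement of S, and checks the result against the variance obtained by
enumerating all mappings. Function names and the test vector are
illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

def nonempty_subsets(nodes):
    for size in range(1, len(nodes) + 1):
        yield from combinations(nodes, size)

def cycle_prob(P, S):
    """E[I(S)] = (|S| - 1)! * product of P_j over j in S."""
    value = Fraction(factorial(len(S) - 1))
    for j in S:
        value *= P[j]
    return value

def variance_formula(P):
    nodes = list(range(len(P)))
    mean = sum(cycle_prob(P, S) for S in nonempty_subsets(nodes))
    cross = Fraction(0)
    for S in nonempty_subsets(nodes):
        rest = [i for i in nodes if i not in S]      # the complement S^c
        cross += cycle_prob(P, S) * sum(cycle_prob(P, T)
                                        for T in nonempty_subsets(rest))
    return mean - mean ** 2 + cross

def variance_brute_force(P):
    n = len(P)
    m1 = m2 = Fraction(0)
    for x in product(range(n), repeat=n):            # all mappings i -> x[i]
        label = list(range(n))
        for i, xi in enumerate(x):
            la, lb = label[i], label[xi]
            if la != lb:
                label = [lb if l == la else l for l in label]
        c = len(set(label))                          # number of components
        weight = Fraction(1)
        for xi in x:
            weight *= P[xi]
        m1 += weight * c
        m2 += weight * c * c
    return m2 - m1 ** 2

P = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
print(variance_formula(P) == variance_brute_force(P))
```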




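2.  THE SPECIAL CASE $P_j = 1/n$


In the special case where $P_j = 1/n$ for all j we have from
Corollary 2 that

\[ P\{\text{graph is connected}\} = \frac{1}{n} \Big[ 1 + \frac{n - 1}{n} + \frac{(n - 1)(n - 2)}{n^2} + \cdots + \frac{(n - 1)!}{n^{n-1}} \Big] \]

\[ = \frac{1}{n} \Big[ 1 + \sum_{j=1}^{n-1} \frac{(n - 1)!}{(n - j - 1)! \, n^j} \Big] \]

\[ = \frac{(n - 1)!}{n^n} \sum_{i=0}^{n-1} \frac{n^i}{i!} \quad \text{(by letting } i = n - j - 1\text{)} \]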


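which agrees with the results given in [1], [3] and [4]. In addition,
letting $P_n(j) = f_j(1/n, \ldots, 1/n) = P\{C = j\}$, the recursive
formulae (1) and (2) become

\[ P_n(2) = \sum_{k=1}^{n-1} \binom{n-1}{k-1} \Big( \frac{k}{n} \Big)^{k} \Big( \frac{n - k}{n} \Big)^{n-k} P_k(1) \, P_{n-k}(1) \]

and, in general,

\[ P_n(j) = \sum_{k=1}^{n-j+1} \binom{n-1}{k-1} \Big( \frac{k}{n} \Big)^{k} \Big( \frac{n - k}{n} \Big)^{n-k} P_k(1) \, P_{n-k}(j - 1). \]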


Whereas an explicit expression for $P_n(j)$ in terms of Stirling
numbers has been previously derived in [1], the above recursive equations
appear to be new. In this special case the formula (3) for E(C)
simplifies to

\[ E(C) = \sum_{k=1}^{n} \binom{n}{k} \frac{(k - 1)!}{n^k} \]

which had previously been obtained (see [1] or [5]) in a much more
involved manner. In addition, from (5) we have

\[ \mathrm{Var}(C) = \sum_{k=1}^{n} \binom{n}{k} \frac{(k - 1)!}{n^k} - \Big[ \sum_{k=1}^{n} \binom{n}{k} \frac{(k - 1)!}{n^k} \Big]^2 + \sum_{k=1}^{n-1} \binom{n}{k} \frac{(k - 1)!}{n^k} \sum_{j=1}^{n-k} \binom{n-k}{j} \frac{(j - 1)!}{n^j} \]

which appears to be new.
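The uniform-case expressions lend themselves to a quick numerical cross
check. The sketch below assumes the uniform recursion takes the form
obtained by grouping the subsets S of formulae (1) and (2) by size k
(there are C(n-1, k-1) subsets of size k containing node 1); the function
names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial

def p_connected(n):
    """(n-1)!/n^n * sum_{i=0}^{n-1} n^i / i!"""
    return Fraction(factorial(n - 1), n ** n) * sum(
        Fraction(n ** i, factorial(i)) for i in range(n))

@lru_cache(maxsize=None)
def p_components(n, j):
    """P_n(j) = P{C = j} for n equally likely nodes, by recursion on j."""
    if j == 1:
        return p_connected(n)
    if j > n:
        return Fraction(0)
    return sum(comb(n - 1, k - 1)
               * Fraction(k, n) ** k * Fraction(n - k, n) ** (n - k)
               * p_connected(k) * p_components(n - k, j - 1)
               for k in range(1, n - j + 2))

def cycle_term(n, k):
    """Total of E[I(S)] over the C(n, k) subsets of size k."""
    return Fraction(comb(n, k) * factorial(k - 1), n ** k)

def mean_components(n):
    return sum(cycle_term(n, k) for k in range(1, n + 1))

def var_components(n):
    mean = mean_components(n)
    cross = sum(cycle_term(n, k)
                * sum(Fraction(comb(n - k, j) * factorial(j - 1), n ** j)
                      for j in range(1, n - k + 1))
                for k in range(1, n))
    return mean - mean ** 2 + cross

n = 4
dist = [p_components(n, j) for j in range(1, n + 1)]
print(dist, mean_components(n), var_components(n))
```

A useful consistency check is that the mean and variance computed from
the distribution `dist` match `mean_components(n)` and
`var_components(n)`.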
3.  SCHUR FUNCTIONS


We say that the vector $\mathbf{P} = (P_1, \ldots, P_n)$ majorizes the
vector $\mathbf{Q} = (Q_1, \ldots, Q_n)$, written
$\mathbf{P} \geq_m \mathbf{Q}$, if

\[ \sum_{j=1}^{i} P_{(j)} \geq \sum_{j=1}^{i} Q_{(j)}, \quad i = 1, \ldots, n - 1, \]

\[ \sum_{j=1}^{n} P_{(j)} = \sum_{j=1}^{n} Q_{(j)}, \]

where $P_{(j)}$ and $Q_{(j)}$ are respectively the j-th largest values of
$P_1, \ldots, P_n$ and $Q_1, \ldots, Q_n$.

A permutation invariant function $H(\mathbf{P})$ is said to be a Schur
convex function if $H(\mathbf{P}) \geq H(\mathbf{Q})$ whenever
$\mathbf{P} \geq_m \mathbf{Q}$. It is well known (see [2]) that a
sufficient condition for H to be Schur convex is that

\[ (P_i - P_j) \Big( \frac{\partial H}{\partial P_i} - \frac{\partial H}{\partial P_j} \Big) \geq 0 \quad \text{for all } P_i, P_j. \]

It is straightforward to verify that the expression presented in
Corollary 2 for the probability that the graph is connected is Schur
convex, and from Equation (3) that -E(C) is Schur convex. Since every
probability vector $\mathbf{P} = (P_1, \ldots, P_n)$ majorizes the vector
$(1/n, 1/n, \ldots, 1/n)$, it follows that the probability that the graph
is connected is minimized and the expected number of components is
maximized in the uniform case.
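The extremal role of the uniform vector can be illustrated numerically
on a small example. The sketch below is an illustration, not a proof: it
checks the majorization ordering for one hypothetical pair of vectors and
compares the connection probability (Corollary 2) and E(C) (Equation (3),
grouped by subset size). All names are assumptions made for this example.

```python
from fractions import Fraction
from math import factorial

def majorizes(P, Q):
    """True if P majorizes Q (partial sums of the sorted entries)."""
    p, q = sorted(P, reverse=True), sorted(Q, reverse=True)
    sum_p = sum_q = 0
    for a, b in zip(p[:-1], q[:-1]):
        sum_p += a
        sum_q += b
        if sum_p < sum_q:
            return False
    return sum(p) == sum(q)

def connected_prob(P):
    """Corollary 2, evaluated recursively."""
    n = len(P)
    def bracket(used):
        return 1 + sum(P[j] * bracket(used | {j})
                       for j in range(n) if j not in used)
    return sum(P[i] ** 2 * bracket(frozenset([i])) for i in range(n))

def expected_components(P):
    """Equation (3) via elementary symmetric polynomials e_k."""
    e = [Fraction(1)] + [Fraction(0)] * len(P)
    for p in P:
        for k in range(len(P), 0, -1):
            e[k] += p * e[k - 1]
    return sum(factorial(k - 1) * e[k] for k in range(1, len(P) + 1))

uniform = [Fraction(1, 3)] * 3
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
print(majorizes(skewed, uniform))
print(connected_prob(skewed), connected_prob(uniform))            # 21/32 17/27
print(expected_components(skewed), expected_components(uniform))  # 11/8 38/27
```

Here the skewed vector majorizes the uniform one, has the larger
connection probability (21/32 against 17/27) and the smaller expected
number of components (11/8 against 38/27).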


REFERENCES 


[1]  Frank, H. and I. Frisch, COMMUNICATION, TRANSMISSION AND TRANSPORTATION
     NETWORKS, Addison-Wesley, 1971.

[2]  Hardy, G. H., J. E. Littlewood and G. Pólya, INEQUALITIES, Cambridge
     University Press, Cambridge, 1952.

[3]  Harris, B., "Probability Distributions Related to Random Mappings,"
     Annals of Mathematical Statistics, Vol. 31, pp. 1045-1062, (1960).

[4]  Katz, L., "Probability of Indecomposability of a Random Mapping
     Function," Annals of Mathematical Statistics, Vol. 26, pp. 512-517,
     (1955).

[5]  Kruskal, M. D., "The Expected Number of Components Under a Random
     Mapping Function," American Mathematical Monthly, Vol. 61,
     pp. 392-397, (1954).


